-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🎨Unlossy AI translation #479
Conversation
We change the text type of the ai-translate endpoint from `text` to `json` to be able to send a json object with the text associated with an id. The ai will return the translated text with the id. Thanks to that we can preserve totally the Blocknote json and replace only the text.
We improved the blocks translations. Only for translations: We are now preserving totally the Blocknote json structure like to avoid losing any information. We are not transforming the blocks in markdown anymore, we are adding "id" to the text type blocks, to identify them, we extract the text and send them to the ai service to translate them. The ai preserve the id, then we replace the text blocks with the translated ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice solution! When we discussed on Matrix I missed that we were purely talking about translations (not other AI operations). So, basically you extract all text strings from the document and translate those via an LLM. That's quite nice and I think your approach should work!
@@ -280,31 +310,17 @@ const AIMenuItem = ({ | |||
const editor = useBlockNoteEditor(); | |||
const handleAIError = useHandleAIError(); | |||
|
|||
const handleAIAction = async () => { | |||
const handleAIAction = () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just curious; was the async
removed intentionally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed all the await
so async
was not needed anymore, I have linters that warn we about these things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, was wondering as there's a then
below there's not much of a change.
@@ -0,0 +1,145 @@ | |||
import { Block as CoreBlock } from '@blocknote/core'; | |||
|
|||
type Block = Omit<CoreBlock, 'type'> & { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this? I think you could use Block
from BlockNote directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it. It's possible for sure but would require a bit more work on the typings. If it's a priority we can pair on this
throw new Error('No response from AI'); | ||
} | ||
|
||
const markdown = await editor.tryParseMarkdownToBlocks(responseAI.answer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course, non-translate AI actions like these will still be "lossy". I'll think of a blocknote-based solution for this (as part of BlockNote AI features)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this part is still lossy, I tried but I didn't have good AI results when it is with actions, the AI has difficulty to do the request when the text is too split:
https://github.com/numerique-gouv/impress/blob/f45e724b1797693841b9304c4d7dcc6939309bad/src/frontend/apps/impress/src/features/docs/doc-editor/api/useDocAITransform.tsx#L5-L9
Rephrase and summarize can not really work with this system I think, the AI will have the remove some ids.
I think it is less likely that users will select tables or complicated structures for these actions (to monitor).
@@ -62,7 +66,7 @@ def call_ai_api(self, system_content, text): | |||
response_format={"type": "json_object"}, | |||
messages=[ | |||
{"role": "system", "content": system_content}, | |||
{"role": "user", "content": json.dumps({"markdown_input": text})}, | |||
{"role": "user", "content": json.dumps({"answer": text})}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you keep using the answer keyword, although it is not the answer in this context but the input ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also it feels weird than when we already have structured json output in the translate case, we still pass it to the model within the answer json.
We could get rid of the anwser
altogether such that we don't have another indentation layer in the json.
@@ -426,11 +426,12 @@ class AITranslateSerializer(serializers.Serializer): | |||
language = serializers.ChoiceField( | |||
choices=tuple(enums.ALL_LANGUAGES.items()), required=True | |||
) | |||
text = serializers.CharField(required=True) | |||
text = serializers.JSONField(required=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: wouldn't call this text in this context
Hello Antho, nice solution ! However I think there might be issues when the order of the words differ in the two languages. For example I tried translating "Je te donne mon livre", which should translate to "I give you my book".
However the answer from gpt-4o-mini is:
Here the bold word has been modified to give instead of you. However, If i ask llama3.2 the answer becomes:
where the sentence doesn't mean anything anymore. |
@arnaud-robin you requested my review here, but I think you're right about:
Good spot, this seems like a potential blocker and I missed that one earlier. I'm doing an experiment to keep markdown formatting across LLM responses, which might be a viable alternative and also makes it possible for operations other than "translate" (i.e.: rewrite, simplify, etc.). It's very much wip atm though |
Yes, I can see it is not the best so, even with gpt-4o-mini... I am still wondering if markdown is the good way to solve this problem. Curious to see how you do it with markdown @YousefED |
Purpose
To translate with the AI we were using lossy functions (
blocksToMarkdownLossy
/tryParseMarkdownToBlocks
), so we were loosing indentations, colors, background, lot of formatting. Some errors were popping up as well about tables loosing ids after the translation.Proposal
Demo
scrnli_ulUKPeGj2U7Hbw.webm